The determination of the reading sequence of text is fundamental to document understanding. The problem is easily solved for pages in which text is organized into a series of lines and vertical alignments that run the height of the page (producing multiple columns that can be read from left to right). We present a situation, the directory page parsing problem, in which information is presented on the page in an irregular, visually organized, two-dimensional format. Directory pages are quite common in financial prospectuses and carry information about organizations, their addresses, and their relationships, which is key to client onboarding. Interestingly, directory pages sometimes have a hierarchical structure, motivating the need to generalize the reading sequence to a reading tree. We present a solution to the problem of identifying directory pages and constructing the reading tree, using (learnt) classification of text segments and a bottom-up (left, upper-left, top) traversal of the segments. The solution is a key component of a production service supporting the automatic extraction of organization, address, and relationship information from client onboarding documents.
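To make the traversal idea concrete, here is a toy sketch (not the paper's learnt pipeline) of building a reading tree from text segments with axis-aligned bounding boxes: each segment attaches to its nearest neighbor to the left, upper-left, or above, visited bottom-up. The `Segment` class, the coordinate convention (y grows downward), and the Manhattan-distance tie-break are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    text: str
    x: float  # left edge of the bounding box
    y: float  # top edge of the bounding box (y grows downward)
    children: list = field(default_factory=list)

def build_reading_tree(segments):
    """Attach each segment to the nearest candidate that lies to its
    left, upper-left, or directly above it; segments with no such
    candidate become roots of the reading tree."""
    roots = []
    # Visit segments bottom-up (largest y first).
    for seg in sorted(segments, key=lambda s: (-s.y, -s.x)):
        candidates = [p for p in segments
                      if p is not seg and p.y <= seg.y and p.x <= seg.x
                      and (p.y < seg.y or p.x < seg.x)]
        if candidates:
            # Nearest candidate by Manhattan distance serves as parent.
            parent = min(candidates,
                         key=lambda p: (seg.y - p.y) + (seg.x - p.x))
            parent.children.append(seg)
        else:
            roots.append(seg)
    return roots
```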
Medical report generation is a challenging task: it is time-consuming and requires the expertise of experienced radiologists. The goal of medical report generation is to accurately capture and describe the findings in an image. Previous works pretrain their visual encoding neural networks on large datasets from different domains, and these networks fail to learn general visual representations in the specific medical domain. In this work, we propose a medical report generation framework that uses a contrastive learning approach to pretrain the visual encoder and requires no extra meta information. In addition, we adopt lung segmentation as an augmentation method within the contrastive learning framework. The segmentation guides the network to focus on encoding the visual features within the lung region. Experimental results show that the proposed framework improves the performance and quality of the generated medical reports both quantitatively and qualitatively.
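As an illustration of the contrastive pretraining ingredient, the sketch below pairs a standard NT-Xent loss with a hypothetical lung-mask augmentation. The projection heads, augmentation pipeline, and hyperparameters of the actual framework are not given in the abstract, so everything here is an assumption.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """Standard NT-Xent contrastive loss over two augmented views.
    z1, z2: (N, D) projections of the two views of the same batch."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)          # (2N, D)
    sim = z @ z.t() / temperature                        # cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                # drop self-pairs
    # The positive for row i is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def lung_masked_view(image, lung_mask):
    """Hypothetical augmentation: keep only the lung region so the
    encoder focuses on lung features (lung_mask is a 0/1 tensor)."""
    return image * lung_mask
```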
Large language models have demonstrated the ability to condition on and generate both natural language and programming language text. Such models open up the possibility of multilingual code generation: can code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities with other languages. We facilitate the exploration of this topic by proposing MultiPL-E, the first multilingual parallel benchmark for natural-language-to-code generation. MultiPL-E extends the HumanEval benchmark (Chen et al., 2021) to support 18 additional programming languages, covering a range of programming paradigms and levels of popularity. We evaluate two state-of-the-art code generation models on MultiPL-E: Codex and InCoder. We find that on several languages Codex matches, and even exceeds, its performance on Python. The range of programming languages represented in MultiPL-E allows us to explore the effects of language frequency and language features on model performance. Finally, the MultiPL-E approach of compiling a code generation benchmark into new programming languages is both scalable and extensible. We describe a general approach for easily adding support for new benchmarks and languages.
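HumanEval-style benchmarks such as MultiPL-E are conventionally scored with the unbiased pass@k estimator introduced in the cited Chen et al. (2021); a minimal sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021).
    n: total samples generated, c: samples that pass the unit tests."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))
```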
In most data-scientific approaches, the principle of maximum entropy (MaxEnt) is used to justify a posteriori some parametric model that has already been chosen based on experience, prior knowledge, or computational simplicity. In a formulation perpendicular to conventional model building, we start from a linear system of phenomenological constraints and asymptotically derive the distribution over all viable distributions that satisfy the provided set of constraints. The MaxEnt distribution plays a special role, as it is the most typical among all phenomenologically viable distributions and represents a good expansion point for large-N techniques. This enables us to consistently formulate hypothesis testing in a fully data-driven manner. The appropriate parametric model supported by the data can always be deduced at the end of model selection. Within the MaxEnt framework, we recover the major scores and selection procedures used in multiple applications and assess their ability to capture associations in the data-generating process and to identify the most generalizable model. This data-driven counterpart of standard model selection demonstrates the unifying prospect of the deductive logic advocated by the MaxEnt principle, while potentially providing new insights into the inverse problem.
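For background, the distribution that maximizes entropy under a linear system of constraints takes the standard exponential-family form (a textbook result, not a claim specific to this paper):

```latex
% Maximize S[p] = -\sum_x p(x)\ln p(x) subject to normalization and
% the linear phenomenological constraints \sum_x p(x)\, f_a(x) = F_a.
% Introducing Lagrange multipliers \lambda_a yields
p^{*}(x) \;=\; \frac{1}{Z(\lambda)} \exp\!\Big(-\sum_a \lambda_a f_a(x)\Big),
\qquad
Z(\lambda) \;=\; \sum_x \exp\!\Big(-\sum_a \lambda_a f_a(x)\Big),
% with the \lambda_a fixed by \partial \ln Z / \partial \lambda_a = -F_a.
```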
A large body of literature has shown that prompt-based learning is an efficient way to use large pretrained language models. Recent works have also demonstrated the possibility of steering a chatbot's output by plugging in appropriate prompts. Gradient-based methods are often used to perturb the prompts; however, some language models are not even accessible to the public. In this work, we first explore combining prompting with reinforcement learning (RL) to steer a model's generation without accessing any of the model's parameters. Second, to reduce the training effort and enhance generalizability to unseen tasks, we apply multi-task learning so the model learns to generalize better to new tasks. Experimental results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters. Furthermore, the model demonstrates a strong ability to quickly adapt to unseen tasks in fewer steps than the baseline model.
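A minimal sketch of the prompting-plus-RL idea, with the frozen dialogue model treated as a black box: a REINFORCE-style bandit learns which steering prompt earns the highest reward. The prompt list, reward function, and hyperparameters are toy assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
prompts = ["Respond politely:", "Respond briefly:", "Respond formally:"]
logits = np.zeros(len(prompts))      # policy parameters over prompt choices

def black_box_reward(prompt: str) -> float:
    """Stand-in for querying the frozen dialogue model and scoring the
    steered output with a reward model; here a toy preference."""
    return 1.0 if "politely" in prompt else 0.1

lr, baseline = 0.5, 0.0
for step in range(200):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    a = rng.choice(len(prompts), p=probs)
    r = black_box_reward(prompts[a])
    baseline += 0.05 * (r - baseline)            # moving-average baseline
    grad = -probs
    grad[a] += 1.0                               # d log pi(a) / d logits
    logits += lr * (r - baseline) * grad         # REINFORCE update

print(prompts[int(np.argmax(logits))])           # -> "Respond politely:"
```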
Airway segmentation is essential for chest CT image analysis. However, it remains a challenging task because of the inherently complex tree-like structure and the imbalanced sizes of airway branches. Current deep learning methods focus on model structure design, while the potential of training strategies and loss functions remains underexplored. We therefore propose a simple yet effective airway segmentation pipeline, denoted NaviAirway, which finds finer bronchioles with a bronchiole-sensitive loss function and a human-vision-inspired iterative training strategy. Experimental results show that NaviAirway outperforms existing methods, particularly in identifying high-generation bronchioles and in robustness to new CT scans. Moreover, NaviAirway is general: it can be combined with different backbone models and significantly improves their performance. In addition, we call for a more comprehensive and fairer evaluation of deep-learning-based airway segmentation methods. NaviAirway can generate airway roadmaps for navigation bronchoscopy and can also be applied in other scenarios where fine, long tubular structures are segmented in biomedical images. The code is publicly available at https://github.com/antonotnawang/naviairway.
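The abstract does not give the bronchiole-sensitive loss in closed form; as one plausible illustration, here is a Dice-style loss with per-voxel weights that up-weight thin, high-generation branches. The weighting scheme (e.g., inverse local branch radius) is an assumption.

```python
import torch

def branch_weighted_dice_loss(pred, target, weight, eps=1e-6):
    """Dice-style loss with per-voxel weights; up-weighting voxels in
    thin, high-generation branches makes the loss more sensitive to
    bronchioles. pred: sigmoid probabilities, target: 0/1 mask,
    weight: per-voxel weights (e.g., inverse local branch radius)."""
    inter = (weight * pred * target).sum()
    denom = (weight * pred).sum() + (weight * target).sum()
    return 1.0 - (2.0 * inter + eps) / (denom + eps)
```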
We aim to bridge the gap between our common-sense few-sample human learning and large-data machine learning. We derive a theory of human-like few-shot learning from the von Neumann-Landauer principle. Modelling human learning is difficult, as how people learn varies from one person to another. Under commonly accepted definitions, we prove that all human or animal few-shot learning, and major models of such learning including the Free Energy Principle and Bayesian Program Learning, approximate our theory under the Church-Turing thesis. We find that deep generative models such as the variational autoencoder (VAE) can be used to approximate our theory and perform significantly better than baseline models, including deep neural networks, on image recognition, low-resource language processing, and character recognition.
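For reference, a sketch of the VAE objective (the negative ELBO) that such deep generative models optimize, assuming a Bernoulli decoder and a diagonal-Gaussian encoder; the specific architecture used in the paper is not reproduced here.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_logits, x, mu, logvar):
    """Negative ELBO for a VAE with a Bernoulli decoder and a
    diagonal-Gaussian encoder q(z|x) = N(mu, exp(logvar))."""
    recon = F.binary_cross_entropy_with_logits(
        recon_logits, x, reduction='sum')
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```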
We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one seeks only deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
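A toy numeric sketch of the return-risk objective described above, taking a weighted average of the mean and a lower-percentile (VaR-style) statistic of simulated returns; the weight w, level alpha, and simulated data are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def return_risk_objective(returns, w=0.7, alpha=0.05):
    """Weighted average of mean and percentile performance:
    w * E[R] + (1 - w) * VaR_alpha(R), where VaR_alpha is taken as
    the alpha-quantile of the simulated return distribution."""
    returns = np.asarray(returns)
    return w * returns.mean() + (1.0 - w) * np.quantile(returns, alpha)

# Example: returns of a candidate policy simulated under sampled rewards.
rng = np.random.default_rng(1)
sim_returns = rng.normal(10.0, 3.0, size=10_000)
print(return_risk_objective(sim_returns))
```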
Many recent works on understanding deep learning try to quantify how much individual data instances influence the optimization and generalization of a model, either by analyzing the behavior of the model during training or by measuring the performance gap of the model when the instance is removed from the dataset. Such approaches reveal the characteristics and importance of individual instances, which may provide useful information for diagnosing and improving deep learning. However, most of the existing works on data valuation require actual training of a model, which often demands a high computational cost. In this paper, we provide a training-free data valuation score, called the complexity-gap score, a data-centric score that quantifies the influence of individual instances on the generalization of two-layer overparameterized neural networks. The proposed score can quantify the irregularity of instances and measure how much each data instance contributes to the total movement of the network parameters during training. We theoretically analyze and empirically demonstrate the effectiveness of the complexity-gap score in finding 'irregular or mislabeled' data instances, and also provide applications of the score in analyzing datasets and diagnosing training dynamics.
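The complexity-gap score itself is not reproduced here; as a loosely related, training-free illustration, the sketch below scores each instance by the gradient norm it induces in a randomly initialized two-layer network, a crude proxy for how much an instance would move the parameters early in training. Architecture, loss, and scoring rule are all assumptions.

```python
import torch
import torch.nn as nn

def init_gradient_norm_scores(X, y, hidden=512, seed=0):
    """Illustrative training-free proxy (not the paper's actual score):
    per-instance gradient norm of a randomly initialized two-layer
    network. X: (N, d) float features, y: (N,) float 0/1 labels."""
    torch.manual_seed(seed)
    net = nn.Sequential(nn.Linear(X.shape[1], hidden), nn.ReLU(),
                        nn.Linear(hidden, 1))
    loss_fn = nn.BCEWithLogitsLoss()
    scores = []
    for xi, yi in zip(X, y):
        net.zero_grad()
        loss = loss_fn(net(xi.unsqueeze(0)).squeeze(1), yi.unsqueeze(0))
        loss.backward()
        g2 = sum((p.grad ** 2).sum() for p in net.parameters())
        scores.append(g2.sqrt().item())
    return scores
```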
Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive amounts of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have theoretically shown that the global solutions to the network training problem under a simplified "unconstrained feature model" exhibit this phenomenon. We take a step further and prove that Neural Collapse occurs for deep linear networks under the popular mean squared error (MSE) and cross-entropy (CE) losses. Furthermore, we extend our research to imbalanced data for the MSE loss and present the first geometric analysis of Neural Collapse under this setting.
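For concreteness, the within-class variability-collapse statistic (often called NC1) commonly used to measure Neural Collapse empirically can be computed as below; the Tr(Σ_W Σ_B†)/C convention is standard but may differ in normalization from any given paper.

```python
import numpy as np

def nc1_variability_collapse(features, labels):
    """NC1 metric: trace of the within-class covariance against the
    pseudo-inverse of the between-class covariance; it approaches 0
    as last-layer features collapse to their class means."""
    classes = np.unique(labels)
    mu_g = features.mean(axis=0)
    sw = np.zeros((features.shape[1],) * 2)   # within-class covariance
    sb = np.zeros_like(sw)                    # between-class covariance
    for c in classes:
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        d = fc - mu_c
        sw += d.T @ d / len(features)
        sb += np.outer(mu_c - mu_g, mu_c - mu_g) / len(classes)
    return np.trace(sw @ np.linalg.pinv(sb)) / len(classes)
```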